Measuring the Quality of Approximated Clusterings
Abstract
Clustering has become an increasingly important task in modern application domains. In many areas, e.g. when clustering complex objects, in distributed clustering, or when clustering mobile objects, it is not possible to compute an “optimal” clustering for technical, security, or efficiency reasons. Recently, a lot of research has been done on efficiently computing approximated clusterings. Here, the crucial question is how much quality has to be sacrificed for the achieved gain in efficiency. In this paper, we present suitable quality measures that allow us to compare approximated clusterings with reference clusterings. We first introduce a quality measure for clusters based on the symmetric set difference. Using this distance function between single clusters, we introduce a quality measure based on the minimum weight perfect matching of sets for comparing partitioning clusterings, as well as a quality measure based on the degree-2 edit distance for comparing hierarchical clusterings.
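The first two measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so several choices here are assumptions: the cluster distance is normalized as |A Δ B| / |A ∪ B|, the smaller clustering is padded with empty clusters so that an unmatched cluster costs 1.0, and the matching is found by brute force over all permutations. The function names `cluster_distance` and `clustering_distance` are hypothetical.

```python
from itertools import permutations


def cluster_distance(a, b):
    """Distance between two clusters based on the symmetric set difference.

    Assumption: normalized as |a Δ b| / |a ∪ b|, giving a value in [0, 1].
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty clusters are treated as identical.
        return 0.0
    return len(a ^ b) / len(union)


def clustering_distance(approx, reference):
    """Minimum total cluster distance over all one-to-one cluster matchings.

    Assumption: the smaller clustering is padded with empty clusters, so a
    cluster left without a real partner contributes a cost of 1.0.
    Brute force over permutations; only practical for a handful of clusters.
    """
    approx = [set(c) for c in approx]
    reference = [set(c) for c in reference]
    size = max(len(approx), len(reference))
    approx += [set()] * (size - len(approx))
    reference += [set()] * (size - len(reference))
    return min(
        sum(cluster_distance(approx[i], reference[p[i]]) for i in range(size))
        for p in permutations(range(size))
    )


if __name__ == "__main__":
    reference = [{1, 2, 3}, {4, 5, 6}]
    approximated = [{1, 2}, {3, 4, 5, 6}]
    print(clustering_distance(approximated, reference))
```

In the example, object 3 is assigned to the wrong cluster, so the best matching pairs each approximated cluster with its closest reference cluster and sums the two normalized differences. For more than a few clusters, a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the pairwise distance matrix would replace the permutation loop.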
Similar resources
Selecting Ensemble Members in Ensemble Clustering Using Voting
Clustering is the process of dividing a dataset into subsets called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. So far, many algorithms based on different approaches have been developed for clustering. An effective option is to combine two or more of these algorithms to solve the clustering problem. Ensemb...
Optimization and Simplification of Hierarchical Clusterings
Clustering is often used to discover structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. In general, a search strategy cannot both (1) consistently construct clusterings of high quality and (2) be computationally inexpensive. However, we can partition the search so that a sys...
Complete hierarchical cut-clustering: A case study on expansion and modularity
In this work we study the hierarchical cut-clustering approach introduced by Flake et al., which is based on minimum s-t-cuts. The resulting cut-clusterings stand out due to strong connections inside the clusters, which indicate a clear membership of the vertices to the clusters. The algorithm uses a parameter which controls the coarseness of the resulting partition and which can be used to con...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reduce the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
Efficient Optimum Design of Structures With Frequency Response Constraint Using High Quality Approximation
An efficient technique is presented for optimum design of structures with both natural frequency and complex frequency response constraints. The main idea is to reduce the number of dynamic analyses by introducing high-quality approximations. Eigenvalues are approximated using the Rayleigh quotient. Eigenvectors are also approximated for the evaluation of eigenvalues and frequency responses. A tw...
Publication date: 2005